Efficient Broadcasts and Simple Algorithms for Parallel Linear Algebra Computing in Clusters
Authors
Abstract
This paper presents a natural and efficient implementation of the classical broadcast message-passing routine that optimizes performance on Ethernet-based clusters. A simple algorithm for parallel matrix multiplication is specifically designed to take advantage of both the parallel computing facilities (CPUs) provided by clusters and the optimized performance of broadcast messages on Ethernet-based clusters. This parallel matrix multiplication algorithm also takes into account possibly heterogeneous computing hardware, balancing the workload across computers according to their relative computing power. Performance tests are presented on a heterogeneous cluster as well as on a homogeneous cluster, where the algorithm is compared with the parallel matrix multiplication provided by the ScaLAPACK library. Another simple parallel algorithm, following the same guidelines used for parallel matrix multiplication, is proposed for LU matrix factorization (a general method for solving dense systems of equations), and some performance tests on a homogeneous cluster are presented.
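As a rough illustration of the approach described in the abstract (not the paper's actual code), the sketch below combines a single broadcast of matrix B with row blocks of A assigned in proportion to each node's relative computing power. The function name par_matmul, the speed[] array, and the particular MPI collectives used are assumptions made for the example only.

    /* Minimal sketch: broadcast-based parallel matrix multiplication with
     * row blocks of A sized by relative computing power. Assumes B is
     * allocated (n x n) on every rank and speed[] is known everywhere;
     * A and C need only be valid on rank 0. Compile with mpicc. */
    #include <mpi.h>
    #include <stdlib.h>

    void par_matmul(double *A, double *B, double *C, int n,
                    const double *speed /* relative power per rank */)
    {
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Row counts proportional to relative computing power. */
        double total = 0.0;
        for (int p = 0; p < size; p++) total += speed[p];
        int *rows = malloc(size * sizeof(int));
        int *cnt  = malloc(size * sizeof(int));
        int *dsp  = malloc(size * sizeof(int));
        int assigned = 0;
        for (int p = 0; p < size; p++) {
            rows[p] = (int)(n * speed[p] / total);
            dsp[p] = assigned * n;
            assigned += rows[p];
        }
        rows[size - 1] += n - assigned;          /* remainder to last rank */
        for (int p = 0; p < size; p++) cnt[p] = rows[p] * n;

        /* Everybody needs B: one broadcast, the operation the paper
         * optimizes for Ethernet-based clusters. */
        MPI_Bcast(B, n * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        /* Distribute the row blocks of A (counts differ per rank). */
        double *Aloc = malloc((size_t)rows[rank] * n * sizeof(double));
        MPI_Scatterv(A, cnt, dsp, MPI_DOUBLE,
                     Aloc, cnt[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);

        /* Local computation: each rank multiplies its rows of A by B. */
        double *Cloc = calloc((size_t)rows[rank] * n, sizeof(double));
        for (int i = 0; i < rows[rank]; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    Cloc[i * n + j] += Aloc[i * n + k] * B[k * n + j];

        /* Collect the resulting row blocks of C on rank 0. */
        MPI_Gatherv(Cloc, cnt[rank], MPI_DOUBLE,
                    C, cnt, dsp, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        free(rows); free(cnt); free(dsp); free(Aloc); free(Cloc);
    }

Keeping the communication pattern to one broadcast plus a scatter/gather of row blocks mirrors the guideline suggested by the abstract: simple, broadcast-centred messaging that heterogeneous Ethernet clusters can handle efficiently.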
Similar resources
A mathematically simple method based on definition for computing eigenvalues, generalized eigenvalues and quadratic eigenvalues of matrices
In this paper, a fundamentally new method, based on the definition, is introduced for numerical computation of eigenvalues, generalized eigenvalues and quadratic eigenvalues of matrices. Some examples are provided to show the accuracy and reliability of the proposed method. It is shown that the proposed method gives sequences other than those of existing methods, but they are still convergent to th...
Parallel Linear Algebra on Clusters
Parallel performance optimization is being applied and further improvements are studied for parallel linear algebra on clusters. Several parallelization guidelines have been defined and are being used on single clusters and local area networks used for parallel computing. In this context, some linear algebra parallel algorithms have been implemented following the parallelization guidelines, and...
Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.
Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing by using MPI and OpenMP on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...
Investigating the Effects of Hardware Parameters on Power Consumptions in SPMV Algorithms on Graphics Processing Units (GPUs)
Although sparse matrix-vector multiplication (SpMV) algorithms are simple, they form important parts of linear algebra algorithms in mathematics and physics. As these algorithms can be run in parallel, Graphics Processing Units (GPUs) have been considered among the best candidates to run them. In recent years, power consumption has been considered as one of the metr...
Static Task Allocation in Distributed Systems Using Parallel Genetic Algorithm
Over the past two decades, PC speeds have increased from a few instructions per second to several million instructions per second. The tremendous speed of today's networks as well as the increasing need for high-performance systems has made researchers interested in parallel and distributed computing. The rapid growth of distributed systems has led to a variety of problems. Task allocation is a...